Direct Regression, Reverse Regression, and Covariance Structure Analysis

Authors

  • CLAES FORNELL
  • BYONG-DUK RHEE
  • YOUJAE YI

Abstract

This paper discusses the issues in estimating the effects of marketing variables with linear models. When the variables are not directly observable, it is well known that direct regression yields biased estimates. Several researchers have recently suggested reverse regression as an alternative procedure. However, it is shown that the reverse regression approach also fails to provide unbiased estimates in general, except for some special cases. It is proposed that covariance structure analysis with an appropriate measurement model can ensure the unbiasedness of estimated effects. These issues are examined in the context of assessing market pioneer advantages.

Marketing researchers are often interested in estimating the impact of a certain exogenous variable on the endogenous variable while controlling for the effects of other exogenous variables. For example, one might ask: Do market pioneers get higher market shares than equally performing late entrants? A conventional approach to answering this question is to regress a market share variable (M) on a pioneer dummy variable (D = 1 for market pioneers, D = 0 for late entrants) and a performance variable (P). That is, the following linear model is used:

$$M = \alpha D + \beta P + u \qquad (1)$$

where $\alpha$ indicates the advantage that pioneers have over late entrants after controlling for the firms' performance.

* The authors thank the editor and the two anonymous reviewers for their helpful comments on the previous version of this paper.

An important task is then to estimate the market pioneer advantage ($\alpha$). When observations on all variables are available, its estimation is straightforward; direct regression of M on D and P will provide unbiased estimates of $\alpha$. However, estimating the market pioneer advantage is problematic if another predictor affecting the market share (e.g., performance) is unobservable and is replaced with its correlates that are observable.
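The contrast between the observable and unobservable cases can be illustrated with a small simulation. The following is only a sketch under assumed settings: the parameter values, the normal error distributions, the way P depends on D, and the noisy proxy `X` are all illustrative choices, not taken from the paper. It fits model (1) by least squares, once with the true P and once with the proxy in its place.

```python
import numpy as np

def ols(y, *columns):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    design = np.column_stack([np.ones_like(y), *columns])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # [intercept, coefficient of columns[0], coefficient of columns[1], ...]

rng = np.random.default_rng(0)
n = 200_000
alpha, beta = 0.5, 1.0                          # illustrative values (assumed)

D = rng.integers(0, 2, n).astype(float)         # pioneer dummy
P = 1.0 * D + rng.normal(size=n)                # performance, correlated with D (assumed)
M = alpha * D + beta * P + rng.normal(size=n)   # model (1); true intercept is zero
X = P + rng.normal(size=n)                      # observable noisy correlate of P (assumed)

alpha_observed = ols(M, D, P)[1]   # direct regression with P observed
alpha_proxy = ols(M, D, X)[1]      # direct regression with the proxy replacing P

print(f"true alpha          : {alpha:.3f}")
print(f"P observed          : {alpha_observed:.3f}")
print(f"P replaced by proxy : {alpha_proxy:.3f}")
```

In this setup the estimate based on the true P should land near the chosen value of alpha, while the proxy-based estimate should not, and enlarging the sample does not close that gap.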
This problem, identified originally in the economics literature on the theory of distributive justice (e.g., Conway and Roberts 1983; Greene 1984), has recently been introduced to marketing researchers (e.g., Vanhonacker and Day 1987). Researchers (Vanhonacker and Day 1987) have identified the case where the usual regression (often called direct regression) provides biased estimates and, in its place, suggested an alternative procedure for obtaining an unbiased estimate of $\alpha$ on the basis of related econometric research (e.g., Goldberger 1984). This estimation procedure is referred to as "reverse regression" since the roles of endogenous and exogenous variables are reversed; that is, exogenous variables (e.g., P) are regressed on endogenous variables (e.g., M). Indeed, reverse regression may yield unbiased estimates under certain circumstances. However, as we show later in this paper, reverse regression fails to provide unbiased estimates in general.

Given the widespread use of regression analysis and the difficulty of directly observing variables in marketing, it seems necessary to understand these problems and develop an estimation procedure that can provide unbiased estimates. The purpose of this paper is therefore (1) to investigate the problems in estimating linear regression models with unobservable variables, (2) to show the limitations of reverse regression, a method recently suggested in place of the often-used direct regression, (3) to propose an alternative method that can yield unbiased estimates, and (4) to illustrate these alternative procedures in the context of substantive research (i.e., assessing market pioneer advantages).

1. Issues in estimation by direct and reverse regression

In this section, we examine the issues in estimating linear models via direct and reverse regression in the context of assessing market pioneer advantages.
Research in econometrics (e.g., Goldberger 1984) suggests that unbiased estimation of the coefficient (e.g., $\alpha$) for a predictor (e.g., D) depends upon measurement properties of another predictor (e.g., P). Thus, the issues are examined according to the way P is measured.

1.1. Case I: single measures of P

Suppose P is measured with a single indicator with random error. Such a case can be represented by the following equations.

$$M = \beta P + \alpha D + u$$
$$X = \lambda P + \theta$$
$$P = cD + \varepsilon$$

That is, X is an observed indicator of P, and it has the correlation of $\lambda$ with P. Also, the correlation between D and P is c.

Several points should be noted with respect to this specification. First, this specification is different from the specification by Goldberger (1984) in that a random error term (u) is added to the equation for M. Goldberger (1984) used a deterministic model without the random error term. However, given a basic and unpredictable element of randomness in market responses, a model with the error term seems to be more justified (see Johnston 1984). Vanhonacker and Day (1987) also used the model containing random error. Furthermore, the model with random error is the general case, because the deterministic model is the special case in which the random error is zero.

Note also that we have added an explicit relation between D and P to the model specification. Specifically, market pioneering is posited to affect firms' performance. This relation is based on previous research in the area. Several researchers have argued that the order of entry gives the pioneer advantages such as broader product lines, higher product quality, lower production cost, lower advertising cost, etc.; for example, market pioneers can develop and position products for the largest and most lucrative segments and leave the smaller and less desirable market niches for late entrants (Robinson 1988; Robinson and Fornell 1985; Schmalensee 1978).
Fershtman, Mahajan, and Muller (1990) also provide theoretical support for this relationship. However, the estimation issues and results in this paper hold whether exogenous variables are causally related or merely correlated. Given that exogenous variables are often correlated, the model should be relevant to many research settings.

Let us assume that

$$E(\varepsilon \mid D) = 0, \quad E(\theta \mid P, D) = 0, \quad E(u \mid P, D) = 0,$$
$$V(\varepsilon \mid D) = \sigma^2_{\varepsilon}, \quad V(\theta \mid P, D) = \sigma^2_{\theta}, \quad V(u \mid P, D) = \sigma^2_{u},$$
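The Case I specification can be simulated to see how direct and reverse regression behave with and without the random error u. This is a minimal sketch, not the authors' procedure: the function name `simulate_case1`, all parameter values, and the normal error distributions are assumptions made for illustration. The reverse-regression estimate is computed in the usual way for this literature, by regressing X on M and D and taking minus the ratio of the D coefficient to the M coefficient.

```python
import numpy as np

def ols(y, *columns):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    design = np.column_stack([np.ones_like(y), *columns])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def simulate_case1(sigma_u, n=200_000, alpha=0.5, beta=1.0, lam=0.8, c=1.0,
                   sigma_eps=1.0, sigma_theta=0.6, seed=0):
    """Simulate the Case I model and return (direct, reverse) estimates of alpha.

    All default values are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    D = rng.integers(0, 2, n).astype(float)                  # pioneer dummy
    P = c * D + sigma_eps * rng.normal(size=n)               # P = cD + eps
    X = lam * P + sigma_theta * rng.normal(size=n)           # X = lambda P + theta
    M = beta * P + alpha * D + sigma_u * rng.normal(size=n)  # M = beta P + alpha D + u

    # Direct regression: M on D and the observed indicator X.
    direct = ols(M, D, X)[1]

    # Reverse regression: X on M and D, then solve the fitted equation for M.
    _, b_m, b_d = ols(X, M, D)
    reverse = -b_d / b_m
    return direct, reverse

for sigma_u in (0.0, 1.0):
    direct, reverse = simulate_case1(sigma_u)
    print(f"sigma_u={sigma_u:.1f}  direct={direct:.3f}  reverse={reverse:.3f}  (true alpha=0.5)")
```

With these assumed settings, the reverse estimate should sit close to the true value only when `sigma_u` is zero, which corresponds to the deterministic special case mentioned in the text, while the direct estimate should be off in both runs because X measures P with error.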

Similar resources

Random regression models for estimation of covariance functions of growth in Iranian Kurdi sheep

Body weight (BW) records (n=11,659) of 4961 Kurdi sheep from 215 sires and 2085 dams were used to estimate the additive genetic, direct and maternal permanent environmental effects on growth from 1 to 300 days of age. The data were collected from 1993 to 2015 at a breeding station in North Khorasan province, Iran. Genetic parameters for growth traits were estimated using random regression test-...

Semiparametric Mean–Covariance Regression Analysis for Longitudinal Data

Efficient estimation of the regression coefficients in longitudinal data analysis requires a correct specification of the covariance structure. Existing approaches usually focus on modeling the mean with specification of certain covariance structures, which may lead to inefficient or biased estimators of parameters in the mean if misspecification occurs. In this article, we propose a data-drive...

Robust Minimax Probability Machine Regression

We formulate regression as maximizing the minimum probability (Ω) that the true regression function is within ±ε of the regression model. Our framework starts by posing regression as a binary classification problem, such that a solution to this single classification problem directly solves the original regression problem. Minimax probability machine classification (Lanckriet et al., 2002a) is u...

Robust Minimax Probability Machine Regression ; CU-CS-952-03

We formulate regression as maximizing the minimum probability (Ω) that the true regression function is within ±ε of the regression model. Our framework starts by posing regression as a binary classification problem, such that a solution to this single classification problem directly solves the original regression problem. Minimax probability machine classification (Lanckriet et al., 2002a) is u...

Gaussian process functional regression modeling for batch data.

A Gaussian process functional regression model is proposed for the analysis of batch data. Covariance structure and mean structure are considered simultaneously, with the covariance structure modeled by a Gaussian process regression model and the mean structure modeled by a functional regression model. The model allows the inclusion of covariates in both the covariance structure and the mean st...

Journal title:

Volume   Issue

Pages  -

Publication date 2004